This report provides an in-depth analysis of the Materials Project batteries dataset. It includes an overview of the dataset, covering its size, column descriptions, and any missing values. Additionally, the report features various visualizations, predictive modeling, and an analysis of current trends in the battery industry.
Dominance of Lithium Batteries: Lithium batteries form the largest group in the dataset, outnumbering the second-largest group by more than five times. Further analysis attributes their widespread use to robust stability, high energy density, and minimal sensitivity to volume changes throughout their lifespan.
Aluminum as a Promising Alternative: While lithium batteries dominate the dataset, aluminum batteries exhibit similar qualities and, in some cases, even surpass lithium’s performance, particularly in volumetric capacity. However, aluminum batteries tend to experience greater volume changes and lower stability. Despite these trade-offs, their potential advantages make aluminum batteries a promising alternative that deserves further attention.
library(dplyr)
library(readr)
library(ggplot2)
library(kableExtra)
library(gridExtra)
library(purrr)
library(corrplot)
library(tidyr)
library(plotly)
library(RColorBrewer)
library(DT)
library(caret)
library(randomForest)
library(factoextra)
library(fpc)
library(dbscan)
| Attribute | Description |
|---|---|
| Battery ID | Identifier of the battery. |
| Battery Formula | Chemical formula of the battery material. |
| Working Ion | Primary ion responsible for charge transport in the battery. |
| Formula Charge | Chemical formula of the battery material in the charged state. |
| Formula Discharge | Chemical formula of the battery material in the discharged state. |
| Statistic | Battery ID | Battery Formula | Working Ion | Formula Charge | Formula Discharge |
|---|---|---|---|---|---|
| Length | 4351 | 4351 | 4351 | 4351 | 4351 |
| Class | character | character | character | character | character |
| Mode | character | character | character | character | character |
| Attribute | Description |
|---|---|
| Max Delta Volume | Change in volume (%) for a given voltage step using the formula: max(charge, discharge)/min(charge, discharge) - 1. |
| Average Voltage | Average voltage for each voltage step. |
| Gravimetric Capacity | Gravimetric capacity, or charge stored per unit mass (mAh/g). |
| Volumetric Capacity | Volumetric capacity, or charge stored per unit volume (mAh/cm³). |
| Gravimetric Energy | Gravimetric energy density relative to the battery mass (Wh/kg). |
| Volumetric Energy | Volumetric energy density relative to the battery volume (Wh/L). |
| Atomic Fraction Charge | Atomic fraction of the working ion in the charged state. |
| Atomic Fraction Discharge | Atomic fraction of the working ion in the discharged state. |
| Stability Charge | Stability indicator of the material in the charged state. |
| Stability Discharge | Stability indicator of the material in the discharged state. |
| Steps | Number of distinct voltage steps from fully charged to discharged, based on stable intermediate states. |
| Max Voltage Step | Maximum absolute difference between adjacent voltage steps. |
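
As a minimal illustration of the Max Delta Volume formula in the table above, the sketch below computes the relative volume change from a charged and a discharged volume. The function name `max_delta_volume` and the example volumes are hypothetical and are not taken from the dataset.

```r
# Hypothetical helper: relative volume change between two states,
# following max(charge, discharge) / min(charge, discharge) - 1.
max_delta_volume <- function(volume_charge, volume_discharge) {
  pmax(volume_charge, volume_discharge) /
    pmin(volume_charge, volume_discharge) - 1
}

# Made-up volumes (arbitrary units) for three electrode pairs.
volume_charge    <- c(100, 250, 80)
volume_discharge <- c(104, 200, 80)

delta <- max_delta_volume(volume_charge, volume_discharge)
print(delta)
```

As written, the formula returns a fraction rather than a percentage, so a value of 0.04 corresponds to a 4% change; multiply by 100 if a percentage is needed.
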
| Statistic | Max Delta Volume | Average Voltage | Gravimetric Capacity | Volumetric Capacity | Gravimetric Energy | Volumetric Energy | Atomic Fraction Charge | Atomic Fraction Discharge | Stability Charge | Stability Discharge | Steps | Max Voltage Step |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Min. | 0.00002 | -7.755 | 5.176 | 24.08 | -583.5 | -2208.1 | 0.00000 | 0.007407 | 0.00000 | 0.00000 | 1.000 | 0.0000 |
| 1st Qu. | 0.01747 | 2.226 | 88.108 | 311.62 | 211.7 | 821.6 | 0.00000 | 0.086957 | 0.03301 | 0.01952 | 1.000 | 0.0000 |
| Median | 0.04203 | 3.301 | 130.691 | 507.03 | 401.8 | 1463.8 | 0.00000 | 0.142857 | 0.07319 | 0.04878 | 1.000 | 0.0000 |
| Mean | 0.37531 | 3.083 | 158.291 | 610.62 | 444.1 | 1664.0 | 0.03986 | 0.159077 | 0.14257 | 0.12207 | 1.167 | 0.1503 |
| 3rd Qu. | 0.08595 | 4.019 | 187.600 | 722.75 | 614.4 | 2252.3 | 0.04762 | 0.200000 | 0.13160 | 0.09299 | 1.000 | 0.0000 |
| Max. | 293.19322 | 54.569 | 2557.627 | 7619.19 | 5926.9 | 18305.9 | 0.90909 | 0.993333 | 6.48710 | 6.27781 | 6.000 | 26.9607 |
To keep the following graphs readable, the function below filters out outliers that would otherwise skew the results. It keeps only values that lie no more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. Because the dataset contains extreme values, removing them lets the plots reflect typical trends and allows a fair comparison across different Working Ion groups.
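# Keep only values within 1.5 * IQR of the first and third quartiles.
remove_outliers <- function(x) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1  # lower-case name avoids masking stats::IQR()
  keep <- !is.na(x) & x >= (q1 - 1.5 * iqr) & x <= (q3 + 1.5 * iqr)
  x[keep]
}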
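
The sketch below shows one way such a filter could be applied separately to each Working Ion group before plotting. The toy data frame and its values are invented for illustration, and a copy of the filter is included so the block runs on its own.

```r
# Copy of the IQR filter so this block is self-contained.
remove_outliers <- function(x) {
  q1 <- quantile(x, 0.25)
  q3 <- quantile(x, 0.75)
  iqr <- q3 - q1
  x[x >= (q1 - 1.5 * iqr) & x <= (q3 + 1.5 * iqr)]
}

# Invented toy data: each ion has one extreme value.
toy <- data.frame(
  ion     = rep(c("Li", "Na"), each = 6),
  voltage = c(3.1, 3.3, 3.0, 3.4, 3.2, 54.0,
              2.1, 2.4, 2.2, 2.0, 2.3, -7.0)
)

# Filter each group separately so that one ion's spread does not
# decide what counts as an outlier for another.
filtered <- lapply(split(toy$voltage, toy$ion), remove_outliers)
print(filtered)
```

Filtering per group rather than on the pooled column is a design choice made here for the example; the source does not state which variant the report used.
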
| Correlation | Comment |
|---|---|
| Average Voltage & Gravimetric / Volumetric Energy (0.67 / 0.55) | Both gravimetric and volumetric energy correlate positively with average voltage, more strongly for gravimetric energy. A possible explanation is that designs maximizing gravimetric energy favor ion movement, giving higher voltage and energy states, whereas designs maximizing volumetric energy can slow ion movement, leading to lower voltage, more heat dissipation, and less usable energy output. |
| Atomic Fraction Charge / Discharge (0.60) | Charging and discharging atomic fractions generally align, ensuring a balanced ion cycle. Minor imbalances could impact stability and longevity. |
| Stability Charge / Discharge (0.80) | High stability correlation between charge and discharge phases enhances cycling reliability, crucial for long-term performance. |
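
As a rough sketch of how coefficients like those in the table could be obtained, the block below computes Pearson correlations with base R's `cor()`. The data are invented and only the column names mirror the dataset; the source does not show which correlation method the report used, so Pearson is an assumption here.

```r
# Invented toy values; not taken from the Materials Project data.
toy <- data.frame(
  average_voltage      = c(1.5, 2.2, 3.0, 3.4, 4.1, 4.8),
  gravimetric_capacity = c(210, 150, 180, 120, 140, 100)
)

# Assumption for this sketch: gravimetric energy (Wh/kg) is taken as
# capacity (mAh/g) multiplied by average voltage (V).
toy$gravimetric_energy <- toy$gravimetric_capacity * toy$average_voltage

# Pearson correlation between voltage and energy.
r <- cor(toy$average_voltage, toy$gravimetric_energy, method = "pearson")

# Full pairwise correlation matrix for the numeric columns.
corr_matrix <- cor(toy)
print(round(corr_matrix, 2))
```
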
Below are model-based estimates generated by varying the following parameters in a controlled progression: Max Delta Volume, Average Voltage, Gravimetric Capacity, and Stability Charge. This progression simulates potential future states to help visualize the projected changes in Gravimetric Energy. Each data point on the plot includes a tooltip showing the estimated energy along with the corresponding value of each parameter over a hypothetical range.
Note: Model training was conducted without outliers to improve prediction accuracy and reliability.
The next four graphs illustrate the impact of each parameter on predicted energy by stepping the selected variable through a sequence of values while holding the other variables at their average values.
Among the parameters studied, the trained model indicates that Average Voltage and Gravimetric Capacity have the greatest influence on predicted energy, while Max Delta Volume and Stability Charge show only slight and inconclusive variations in energy output.
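
The one-variable-at-a-time procedure described above can be sketched as follows. This is an illustration only: the source does not show how `model` was trained, so a plain linear model (`lm`) on invented data stands in for it, and the column names and the helper `sweep_variable` are hypothetical.

```r
set.seed(1)

# Invented training data; stands in for the real dataset.
n <- 200
train <- data.frame(
  average_voltage      = runif(n, 1, 5),
  gravimetric_capacity = runif(n, 50, 300),
  max_delta_volume     = runif(n, 0, 0.2),
  stability_charge     = runif(n, 0, 0.3)
)
train$gravimetric_energy <- train$average_voltage * train$gravimetric_capacity +
  rnorm(n, sd = 10)

# Stand-in model (assumption): the report's actual model is not shown.
fit <- lm(gravimetric_energy ~ ., data = train)

# Hypothetical helper: vary one predictor over its observed range while
# holding every other predictor at its mean, then predict.
sweep_variable <- function(fit, data, variable, predictors, n_points = 25) {
  means <- colMeans(data[, predictors])
  grid <- as.data.frame(as.list(means))[rep(1, n_points), , drop = FALSE]
  grid[[variable]] <- seq(min(data[[variable]]), max(data[[variable]]),
                          length.out = n_points)
  grid$predicted_energy <- predict(fit, newdata = grid)
  rownames(grid) <- NULL
  grid
}

predictors <- c("average_voltage", "gravimetric_capacity",
                "max_delta_volume", "stability_charge")
voltage_sweep <- sweep_variable(fit, train, "average_voltage", predictors)
head(voltage_sweep)
```

Calling `sweep_variable` once per predictor would give one table per graph; any model with a `predict` method accepting `newdata` could be substituted for the linear stand-in.
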
# Draw a reproducible random sample of 10 batteries.
set.seed(5643)
sample_data <- batteries_data[sample(nrow(batteries_data), 10), ]
# Select the four predictors used by the model.
sample_input <- sample_data[, c("Max Delta Volume", "Average Voltage", "Gravimetric Capacity", "Stability Charge")]
# Predict gravimetric energy and show it next to the actual value.
sample_data$Predicted_Energy <- predict(model, newdata = sample_input)
result_data <- sample_data[, c("Battery ID", "Gravimetric Energy", "Predicted_Energy")]
knitr::kable(result_data)
| Battery ID | Gravimetric Energy | Predicted_Energy |
|---|---|---|
| mp-756701_Li | 112.98370 | 118.57894 |
| mp-759500_Li | 1013.87490 | 1006.20573 |
| mp-759832_Na | 557.57223 | 557.63370 |
| mp-19395_Li | 251.70507 | 246.12879 |
| mp-510366_K | 217.53634 | 215.47054 |
| mp-25974_Li | 273.43322 | 276.46133 |
| mp-1044783_Zn | 131.82913 | 129.93249 |
| mp-1041066_Mg | 841.78509 | 847.02604 |
| mp-757896_Li | 97.81006 | 83.97572 |
| mp-771696_Li | 311.71147 | 314.62681 |
The model generally predicts energy well, but it remains vulnerable to outliers. For mp-757896_Li, for example, the predicted energy (83.98 Wh/kg) falls about 14% below the actual value (97.81 Wh/kg), while every other sampled record is predicted to within 5%.
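
To put numbers on the deviations in the table above, the sketch below recomputes absolute and relative errors for the ten sampled rows. The values are copied from the table; the column names `actual`, `predicted`, `abs_error`, and `rel_error_pct` are chosen here for illustration.

```r
results <- data.frame(
  battery_id = c("mp-756701_Li", "mp-759500_Li", "mp-759832_Na",
                 "mp-19395_Li", "mp-510366_K", "mp-25974_Li",
                 "mp-1044783_Zn", "mp-1041066_Mg", "mp-757896_Li",
                 "mp-771696_Li"),
  actual = c(112.98370, 1013.87490, 557.57223, 251.70507, 217.53634,
             273.43322, 131.82913, 841.78509, 97.81006, 311.71147),
  predicted = c(118.57894, 1006.20573, 557.63370, 246.12879, 215.47054,
                276.46133, 129.93249, 847.02604, 83.97572, 314.62681)
)

# Absolute error and error relative to the actual value (in percent).
results$abs_error     <- abs(results$predicted - results$actual)
results$rel_error_pct <- 100 * results$abs_error / results$actual

# Row with the largest relative error.
worst <- results[which.max(results$rel_error_pct), ]
print(worst)

# Mean absolute error over the ten sampled rows.
mae <- mean(results$abs_error)
print(mae)
```

Ten rows are too few to characterize the model as a whole; the calculation only makes the comparison in the table explicit.
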
| Cluster | Records_in_Cluster | Cluster_Ions | Mean_Gravimetric_Capacity | Mean_Volumetric_Capacity | Mean_Gravimetric_Energy | Mean_Volumetric_Energy | Mean_Atomic_Fraction_Discharge |
|---|---|---|---|---|---|---|---|
| 1 | 4293 | Li, Ca, Mg, Zn, Na, K, Y, Al, Rb, Cs | 144.4513 | 564.8537 | 444.07493 | 1665.04913 | 0.1509745 |
| 2 | 10 | Zn | 688.2406 | 5169.3732 | 23.27523 | 189.88604 | 0.8668478 |
| 3 | 22 | Mg, Al | 1679.2133 | 3660.6992 | -30.24791 | -78.48487 | 0.8824424 |
| 4 | 8 | Zn, Y, Mg | 515.9942 | 3281.0596 | 82.51808 | 566.05303 | 0.7125000 |
The trends reveal a clear preference for batteries with high
Volumetric Energy and low
Atomic Fraction Discharge, likely driven by the growing
demand for compact, high-performance batteries in modern smart devices.
As each device requires a powerful yet space-efficient energy source,
manufacturers are prioritizing designs that maximize energy density,
ensuring longer usage times between charges.
Clusters 2 and 3 overlap, likely due to their similar
Atomic Fraction Discharge values (0.867 vs. 0.882),
suggesting a shared discharge profile despite other differences.